Eric J. Daza, DrPH, MPS (Twitter @ericjdaza)
30 September 2021 Thursday
Understanding and shaping the health of populations requires both qualitative and quantitative scientific methods. Statistical concepts are used to structure and manage the uncertainty inherent in the scientific study of human life, behavior, and society. In this workshop, I’ll share some of my experiences as a biostatistician and digital health data scientist—all (well, let’s say 95%) using R.
“Confusing P-values with Clinical Impact: The Significance Fallacy”
Simulated/Synthetic Example: 12-week RCT estimating the average treatment effect of two drugs on COVID-19 infection
# Preview the first six patients, then append an ellipsis row for display
tbl_perpid_fancy %>%
  dplyr::select(
    `Patient ID`,
    Arm,
    `Infection Status`
  ) %>%
  # Convert every column to character so the "..." filler row can be bound
  dplyr::mutate(
    `Patient ID` = `Patient ID` %>% as.character,
    Arm = Arm %>% as.character,
    `Infection Status` = `Infection Status` %>% as.character
  ) %>%
  head %>%
  dplyr::bind_rows(
    dplyr::tibble(
      `Patient ID` = "...",
      Arm = "...",
      `Infection Status` = "..."
    )
  ) %>%
  knitr::kable(align = "c")

| Patient ID | Arm | Infection Status |
|---|---|---|
| 1 | Feknuzison | Not Infected |
| 2 | Remdazavir | Not Infected |
| 3 | Remdazavir | Not Infected |
| 4 | Remdazavir | Not Infected |
| 5 | Feknuzison | Not Infected |
| 6 | Remdazavir | Not Infected |
| … | … | … |
| | n.Feknuzison | n.Remdazavir | pct.Feknuzison | pct.Remdazavir |
|---|---|---|---|---|
| Infected | 27 | 12 | 5.4 | 2.4 |
| Not Infected | 473 | 488 | 94.6 | 97.6 |
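The percentages in this table follow directly from the counts. As a minimal sketch in base R (the object names `counts`, `pct`, and `risk_diff` are placeholders of mine, not from the original slides, and the risk-difference line is an added illustration rather than part of the original analysis), the table can be rebuilt like this:

```r
# Counts copied from the 2x2 table above (500 patients per arm).
# matrix() fills column by column, so each pair below is one arm.
counts <- matrix(
  c(27, 473,   # Feknuzison: infected, not infected
    12, 488),  # Remdazavir: infected, not infected
  nrow = 2,
  dimnames = list(
    status = c("Infected", "Not Infected"),
    arm    = c("Feknuzison", "Remdazavir")
  )
)

# Column percentages: share of each arm in each infection status
pct <- round(100 * prop.table(counts, margin = 2), 1)
print(counts)
print(pct)

# Absolute risk difference in percentage points (Feknuzison minus Remdazavir)
risk_diff <- pct["Infected", "Feknuzison"] - pct["Infected", "Remdazavir"]
print(risk_diff)
```

The difference between arms is about 3 percentage points, which is the kind of effect-size summary worth reading alongside any p-value.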
##
## Fisher's Exact Test for Count Data
##
## data: tbl_perpid_fancy$trt and tbl_perpid_fancy$infection
## p-value = 0.02107
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 0.1965155 0.8923482
## sample estimates:
## odds ratio
## 0.4311286
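The call that produced this output is not shown on the slide. As a hedged sketch, the same test can be rerun from the 2x2 counts alone. The matrix layout below (rows are arms, columns are infection status) is my assumption about how `trt` and `infection` are coded, chosen so that the odds ratio compares Remdazavir with Feknuzison; `tab`, `res`, and `sample_or` are illustrative names.

```r
# Rows: arm (Feknuzison first); columns: infection status.
# matrix() fills column by column.
tab <- matrix(
  c(473, 488,   # not infected: Feknuzison, Remdazavir
     27,  12),  # infected:     Feknuzison, Remdazavir
  nrow = 2,
  dimnames = list(
    arm    = c("Feknuzison", "Remdazavir"),
    status = c("Not Infected", "Infected")
  )
)

res <- fisher.test(tab)
print(res$p.value)   # two-sided p-value
print(res$estimate)  # conditional maximum-likelihood odds ratio
print(res$conf.int)  # 95% confidence interval for the odds ratio

# Sample (cross-product) odds ratio, for comparison with the conditional estimate
sample_or <- (tab["Remdazavir", "Infected"] * tab["Feknuzison", "Not Infected"]) /
  (tab["Remdazavir", "Not Infected"] * tab["Feknuzison", "Infected"])
print(sample_or)
```

`fisher.test()` reports the conditional maximum-likelihood estimate of the odds ratio, so it differs slightly from the simple cross-product ratio computed on the last lines.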
logistic regression: \(\text{logit} \Pr(Y=1) = \beta_0 + \beta_1 X\)
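For reference, using the standard definition of the logit, and reading the output so that \(Y=1\) means infected and \(X=1\) means Remdazavir (this coding is my inference from the fitted intercept, not something stated on the slide), the model implies that \(e^{\beta_1}\) is the odds ratio between arms:

\[
\text{logit}(p) = \log\frac{p}{1-p},
\qquad
\frac{\Pr(Y=1 \mid X=1)\,/\,\Pr(Y=0 \mid X=1)}{\Pr(Y=1 \mid X=0)\,/\,\Pr(Y=0 \mid X=0)}
= \frac{e^{\beta_0+\beta_1}}{e^{\beta_0}}
= e^{\beta_1}.
\]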
##
## Call:
## glm(formula = infection ~ trt, family = binomial, data = .)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.3332  -0.3332  -0.2204  -0.2204   2.7312
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -2.8633     0.1979 -14.471   <2e-16 ***
## trt          -0.8422     0.3529  -2.386    0.017 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 329.51 on 999 degrees of freedom
## Residual deviance: 323.35 on 998 degrees of freedom
## AIC: 327.35
##
## Number of Fisher Scoring iterations: 6
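To connect these coefficients back to the 2x2 table, here is a minimal sketch that rebuilds one row per patient from the counts and refits the model. The coding (`trt = 1` for Remdazavir, `infection = 1` for infected) is an assumption inferred from the output above, and `dat`, `fit`, `odds_ratio`, and `p_hat` are illustrative names rather than objects from the original slides.

```r
# Rebuild one row per patient from the 2x2 counts
# (assumed coding: trt = 1 for Remdazavir, infection = 1 for infected)
dat <- data.frame(
  trt       = rep(c(0, 1), each = 500),
  infection = c(rep(c(1, 0), times = c(27, 473)),   # Feknuzison arm
                rep(c(1, 0), times = c(12, 488)))   # Remdazavir arm
)

fit <- glm(infection ~ trt, family = binomial, data = dat)
print(round(coef(fit), 4))

# Odds ratio for Remdazavir versus Feknuzison
odds_ratio <- exp(coef(fit)[["trt"]])
print(odds_ratio)

# Fitted infection probability in each arm (inverse logit of the linear predictor)
p_hat <- plogis(c(Feknuzison = coef(fit)[["(Intercept)"]],
                  Remdazavir = sum(coef(fit))))
print(round(100 * p_hat, 1))
```

With a single binary predictor the model is saturated, so the fitted probabilities reproduce the observed infection percentages in each arm, and the exponentiated `trt` coefficient equals the sample odds ratio.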
| Patient ID | Arm | Week | Infection Status at Week |
|---|---|---|---|
| 75 | Remdazavir | 1 | Not Infected |
| 75 | Remdazavir | 2 | Not Infected |
| 75 | Remdazavir | 3 | Not Infected |
| 75 | Remdazavir | 4 | Not Infected |
| 75 | Remdazavir | 5 | Not Infected |
| 75 | Remdazavir | 6 | Infected |
| 75 | Remdazavir | 7 | Infected |
| 75 | Remdazavir | 8 | Infected |
| 75 | Remdazavir | 9 | Infected |
| 75 | Remdazavir | 10 | Infected |
| 75 | Remdazavir | 11 | Infected |
| 75 | Remdazavir | 12 | Infected |
| … | … | … | … |
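This weekly table stores one row per patient per week (long format). As a small illustration, using only the twelve rows shown for patient 75 and hypothetical column names (`pid`, `arm`, `week`, `infected`) that are not from the original slides, the week of first infection can be extracted with base R:

```r
# Long-format rows for patient 75, copied from the table above
long <- data.frame(
  pid      = 75,
  arm      = "Remdazavir",
  week     = 1:12,
  infected = c(rep(FALSE, 5), rep(TRUE, 7))  # not infected weeks 1-5, infected weeks 6-12
)

# Week of first infection
first_week <- min(long$week[long$infected])
print(first_week)

# Number of weeks recorded as infected
n_weeks_infected <- sum(long$infected)
print(n_weeks_infected)
```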
Daza EJ, Wac K, Oppezzo M. Effects of Sleep Deprivation on Blood Glucose, Food Cravings, and Affect in a Non-Diabetic: An N-of-1 Randomized Pilot Study. Healthcare 2020 Mar (Vol. 8, No. 1, p. 6). Multidisciplinary Digital Publishing Institute.
I originally created these slides for the 2020 Pilipinx American Public Health Conference (PAPHC). I thank PAPHC for motivating this presentation and for providing coordination and technical support. This presentation is independent of my work for Evidation Health, where I am a full-time employee.
Dr. Daza is a data science statistician and digital health data scientist who develops causal inference methods for personal (n-of-1) digital health. He works at Evidation Health (evidation.com). “Significance of evidence is not evidence of significance.” (tinyurl.com/94dc9vn5)
🤓😎🇵🇭
Thank you! Maraming Salamat!
😃😄🙏
Questions?